7 research outputs found

Guided Deep Reinforcement Learning for Swarm Systems

Author: Hüttenrauch Maximilian
Neumann Gerhard
Šošić Adrian
Publication date: 01/01/2016

In this paper, we investigate how to learn to control a group of cooperative agents with limited sensing capabilities such as robot swarms. The agents have only very basic sensor capabilities, yet in a group they can accomplish sophisticated tasks, such as distributed assembly or search and rescue tasks. Learning a policy for a group of agents is difficult due to distributed partial observability of the state. Here, we follow a guided approach where a critic has central access to the global state during learning, which simplifies the policy evaluation problem from a reinforcement learning point of view. For example, we can get the positions of all robots of the swarm using a camera image of a scene. This camera image is only available to the critic and not to the control policies of the robots. We follow an actor-critic approach, where the actors base their decisions only on locally sensed information. In contrast, the critic is learned based on the true global state. Our algorithm uses deep reinforcement learning to approximate both the Q-function and the policy. The performance of the algorithm is evaluated on two tasks with simple simulated 2D agents: 1) finding and maintaining a certain distance to each other and 2) locating a target.

Comment: 15 pages, 8 figures, accepted at the AAMAS 2017 Autonomous Robots and Multirobot Systems (ARMS) Worksho
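The split between what the actors see and what the critic sees can be illustrated with a small sketch. This is not the authors' implementation: the function names, the sensing radius, and the hand-coded rule standing in for a learned policy are all assumptions made for illustration. It only shows that each actor receives locally sensed distances, while the critic's input is built from the full global state and the joint action.

```python
import math

def local_observation(i, positions, sensing_radius):
    """Hypothetical local sensing: sorted distances from agent i to agents in range."""
    xi, yi = positions[i]
    dists = [
        math.hypot(xj - xi, yj - yi)
        for j, (xj, yj) in enumerate(positions)
        if j != i
    ]
    return sorted(d for d in dists if d <= sensing_radius)

def actor(observation, target_distance=1.0):
    """Toy stand-in for a learned policy; it uses only the local observation.

    Returns +1.0 (approach), -1.0 (retreat), or 0.0 (no neighbor sensed).
    """
    if not observation:
        return 0.0
    return 1.0 if observation[0] > target_distance else -1.0

def critic_input(positions, actions):
    """Centralized view used only during learning: global state plus joint action."""
    return [coord for pos in positions for coord in pos] + list(actions)

# Three agents in 2D; the third is outside everyone's sensing range.
positions = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0)]
observations = [
    local_observation(i, positions, sensing_radius=2.0)
    for i in range(len(positions))
]
actions = [actor(obs) for obs in observations]
q_features = critic_input(positions, actions)
```

In a learned system, `actor` and the function consuming `q_features` would be neural networks; the point of the sketch is only the difference in inputs.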

arXiv.org e-Print Archive

TUbiblio

Deep Reinforcement Learning for Swarm Systems

Author: Hüttenrauch Maximilian
Neumann Gerhard
Šošić Adrian
Publication date: 01/01/2019

Recently, deep reinforcement learning (RL) methods have been applied successfully to multi-agent scenarios. Typically, these methods rely on a concatenation of agent states to represent the information content required for decentralized decision making. However, concatenation scales poorly to swarm systems with a large number of homogeneous agents as it does not exploit the fundamental properties inherent to these systems: (i) the agents in the swarm are interchangeable and (ii) the exact number of agents in the swarm is irrelevant. Therefore, we propose a new state representation for deep multi-agent RL based on mean embeddings of distributions. We treat the agents as samples of a distribution and use the empirical mean embedding as input for a decentralized policy. We define different feature spaces of the mean embedding using histograms, radial basis functions and a neural network learned end-to-end. We evaluate the representation on two well-known problems from the swarm literature (rendezvous and pursuit evasion), in a globally and locally observable setup. For the local setup we furthermore introduce simple communication protocols. Of all approaches, the mean embedding representation using neural network features enables the richest information exchange between neighboring agents, facilitating the development of more complex collective strategies.

Comment: 31 pages, 12 figures, version 3 (published in JMLR Volume 20
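A minimal sketch of the empirical mean embedding idea, assuming radial basis function features on a fixed grid of centers. The function names, the grid, and the bandwidth are illustrative assumptions, not taken from the paper. The sketch shows the two properties named in the abstract: the embedding does not depend on the order of the neighbors, and its size does not depend on how many neighbors there are.

```python
import math
import random

def rbf_features(obs, centers, bandwidth=1.0):
    """Map one neighbor observation (tuple of floats) to RBF activations."""
    return [
        math.exp(
            -sum((o - c) ** 2 for o, c in zip(obs, center))
            / (2.0 * bandwidth ** 2)
        )
        for center in centers
    ]

def mean_embedding(neighbor_obs, centers, bandwidth=1.0):
    """Empirical mean embedding: average the feature vectors of all neighbors."""
    if not neighbor_obs:
        # Assumption: an agent with no neighbors gets an all-zero embedding.
        return [0.0] * len(centers)
    feats = [rbf_features(o, centers, bandwidth) for o in neighbor_obs]
    n = len(feats)
    return [sum(column) / n for column in zip(*feats)]

# Hypothetical 3x3 grid of RBF centers over a 2D observation space.
centers = [(x, y) for x in (-1.0, 0.0, 1.0) for y in (-1.0, 0.0, 1.0)]
neighbors = [(0.2, -0.5), (0.9, 0.1), (-0.7, 0.4)]

emb = mean_embedding(neighbors, centers)

# Same neighbors in a different order give the same embedding (up to rounding).
shuffled = neighbors[:]
random.shuffle(shuffled)
emb_shuffled = mean_embedding(shuffled, centers)

# Adding a neighbor changes the values but not the input size for the policy.
emb_larger = mean_embedding(neighbors + [(0.0, 0.0)], centers)
```

The resulting fixed-length vector would then be fed to the decentralized policy in place of a concatenation of neighbor states.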

arXiv.org e-Print Archive

TUbiblio

Inverse Reinforcement Learning in Swarm Systems

Author: KhudaBukhsh Wasiur R.
Koeppl Heinz
Zoubir Abdelhak M.
Šošić Adrian
Publication date: 01/01/2017

Inverse reinforcement learning (IRL) has become a useful tool for learning behavioral models from demonstration data. However, IRL remains mostly unexplored for multi-agent systems. In this paper, we show how the principle of IRL can be extended to homogeneous large-scale problems, inspired by the collective swarming behavior of natural systems. In particular, we make the following contributions to the field: 1) We introduce the swarMDP framework, a sub-class of decentralized partially observable Markov decision processes endowed with a swarm characterization. 2) Exploiting the inherent homogeneity of this framework, we reduce the resulting multi-agent IRL problem to a single-agent one by proving that the agent-specific value functions in this model coincide. 3) To solve the corresponding control problem, we propose a novel heterogeneous learning scheme that is particularly tailored to the swarm setting. Results on two example systems demonstrate that our framework is able to produce meaningful local reward models from which we can replicate the observed global system dynamics.

Comment: 9 pages, 8 figures; ### Version 2 ### version accepted at AAMAS 201

arXiv.org e-Print Archive

TUbiblio

Learning Models of Behavior From Demonstration and Through Interaction

Author: Šošić Adrian
Publication date: 01/01/2018

This dissertation is concerned with the autonomous learning of behavioral models for sequential decision-making. It addresses both the theoretical aspects of behavioral modeling, such as the learning of appropriate task representations, and the practical difficulties regarding algorithmic implementation. The first half of the dissertation deals with the problem of learning from demonstration, which consists of generalizing the behavior of an expert demonstrator based on observation data. Two alternative modeling paradigms are discussed. First, a nonparametric inference framework is developed to capture the behavior of the expert at the policy level. A key challenge in the design of the framework is the objective of making minimal assumptions about the observed behavior type while dealing with a potentially infinite number of system states. Due to the automatic adaptation of the model order to the complexity of the shown behavior, the proposed approach is able to pick up stochastic expert policies of arbitrary structure. Second, a nonparametric inverse reinforcement learning framework based on subgoal modeling is proposed, which allows the expert behavior to be reconstructed efficiently at the intentional level. Unlike most existing approaches, the proposed methodology naturally handles periodic tasks and situations where the intentions of the expert change over time. By adaptively decomposing the decision-making problem into a series of task-related subproblems, both inference frameworks are suitable for learning compact encodings of the expert behavior. For performance evaluation, the models are compared with existing frameworks on synthetic benchmark scenarios and real-world data recorded on a KUKA lightweight robotic arm. In the second half of the work, the focus shifts to multi-agent modeling, with the aim of analyzing the decision-making process in large-scale homogeneous agent networks.
To fill the gap of decentralized system models with explicit agent homogeneity, a new class of agent systems is introduced. For this system class, the problem of inverse reinforcement learning is discussed and a meta learning algorithm is devised that makes explicit use of the system symmetries. As part of the algorithm, a heterogeneous reinforcement learning scheme is proposed for optimizing the collective behavior of the system based on the local state observations made at the agent level. Finally, to scale the simulation of the network to large agent numbers, a continuum version of the model is derived. After discussing the system components and associated optimality criteria, numerical examples of collective tasks are given that demonstrate the capabilities of the continuum approach and show its advantages over large-scale agent-based modeling.

TUbiblio

tuprints

Correction to: Reinforcement learning in a continuum of agents

Author: Abdelhak M. Zoubir
Adrian Šošić
Heinz Koeppl
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Reinforcement learning in a continuum of agents

Author: Abdelhak M. Zoubir
Adrian Šošić
Heinz Koeppl
Publication venue: 'Springer Science and Business Media LLC'

Crossref

14th International Conference of Archaeological Prospection

Author: Asandulesei Andrei
Asandulesei Mihaela
Balaur Radu-Stefan
Baldwin Eamonn
Banaszek Łukasz
Baret Florian
Bates Martin
Bates Richard
Becker Florian
Beenen Gerbrand
Benech Christophe
Besnier Christophe
Beylier Alexandre
Bobokhyan Arsen
Bodet Ludovic
Bornik Alexander
Burzawa Audrey
Buvat Solène
Camerlynck Christian
Casement Rebecca
Chaoui-Derieux Dorothée
Chapman Henry
Ciechowska Helena
Clément Benjamin
Conyers Lawrence B.
Coolen Joris
Dabas Michel
Dacko Marion
Dadaki Stavroula
Daire Marie-Yvane
Darras Lionel
Deberge Yann
Deforce Koen
Dell’Unto Nicolo
Delvoie Simon
De Grave Johan
De Smedt Philippe
Diamanti Nectaria
Dobrev Vassil
Donnadieu Franck
Drzewiecki Mariusz
Dziurdzik Tomasz
d’Agostino Laurent
Eitel Michael
Fassbinder Jörg W. E.
Fikos Elias
Fischer Jens
Flageul Lionel
Flageul Sébastien
Fondrillon Mélanie
Fossurier Carole
Gaffney Chris
Gaffney Vincent
Gagošidze Iulon
Gamon Martin
Garcia-Garcia Ekhine
Garwood Paul
Gavazzi Bruno
Gericke Tatjana
Girard Jean-Pierre
Gondet Sébastien
Goransson Kristian
Grison Hana
Guadagnin Rémi
Gugl Christian
Guillemoteau Julien
Gustavsen Lars
Guyot Alexandre
Görüm Tolga
Hahn Sandra
Hanssens Daan
Hascoët Marie
Haynes Ian
Herbich Tomasz
Hinterleitner Alois
Hofmann Robert
Hubert-Moy Laurence
Hulin Guillaume
Iriarte Eneko
Jordanova Neli
Kaimaris Dimitrios
Kalafatić Hrvoje
Kaniuth Kai
Karamitrou Alexandra
Kay Stephen
Kiersnowski Krzysztof
Kinnaird Tim
Kissas Konstantinos
Kniess Rudolf
Kniess Sandra
Korobov Dmitry S.
Kunze René
Köhn Daniel
König Andreas
Küçükdemirci Melda
Křivánek Roman
Laan Walter
Labazuy Philippe
Lambers Lena
Lambert Danièle
Landeschi Giacomo
Lang Felix
Lanéelle Camille
Laurent Amélie
Lennon Marc
Linck Roland
Linford Neil
Linford Paul
Liverani Paolo
Lockyear Klaus
Louvaris Prodromos
Luong Hiep
Löcker Klaus
Malashev Vladimir Yu.
Meyer Cornelius
Mieszkowski Radoslaw
Milo Peter
Mohammadkhani Kourosh
Morelli Gianfranco
Munschy Marc
Müller Johannes
Müth Silke
Nau Erich
Neubauer Wolfgang
Norgeot Christophe
Nouveau Marie
Ohlrau René
Ohlsson Mattias
Oikonomou Dimitris
Ollivier Julien
Orbons Joep
Paez-Rezende Laurent
Papadopoulos Nikos
Parfant Côme
Parsi Mandana
Pasquet Sylvain
Payne Andrew
Perruchon-Monge Lola
Pickartz Natalie
Piro Salvatore
Piroddi Luca
Pisz Michał Sebastian
Polydoropoulos Konstantinos
Pomar Elena
Péres Thibaut
Pęski Franciszek
Rabbel Wolfgang
Radbauer Silvia
Rassmann Knut
Reiller Hugo
Rejiba Fayçal
Rousset Marie-Odile
Rusch Katharina
Ryndziewicz Robert
Sabatini Serena
Sanchez Christelle
Sarris Apostolos
Schamper Cyril
Scheiblecker Marion
Schiel Hannes
Schmidt Armin
Schmidt Volkmar
Schneidhofer Petra
Schwarz Ralf
Schweitzer Christian
Simirdanis Kleanthis
Simon François-Xavier
Sparrow Tom
Stamnes Arne Anderson
Stampolidis Alexandros
Stele Andreas
Stollnberger Astrid
Stéphan Pierre
Tabbagh Alain
Tencariu Felix-Adrian
Tencer Tomas
Thiesson Julien
Thorwart Martin
Tonning Christer
Torrejón Valdelomar Juan
Totschnig Ralf
Trausmuth Tanja
Traxler Stefan
Trinks Immo
Tsokas Gregory N.
Tsourlos Panagiotis
Ullrich Burkart
Vallet Régis
Vandenberghe Dimitri
van Wijk Ivo
Vargemezis George
Verdonck Lieven
Verhegge Jeroen
Verhoeven Geert
Vitale Quentin
Wallner Mario
Wiewel Adam
Wilken Dennis
Wolf Marco
Wunderlich Tina
Zacharopoulou Georgia
Zawadzki Mikolaj
Zoellner Henning
Zolchow Manuel
Özcan Orkan
Özdoğru İnci Nurgül
Šiljeg Bartul
Šošić-Klindžić Rajna
Publication venue: 'OpenEdition'
Publication date: 25/08/2021

OpenEdition